Mining a Year of Speech
Abstract
The availability of large text corpora has revolutionized linguistics and is of great value in many other areas of scholarship. Our “Mining a Year of Speech” project, funded by the transatlantic “Digging into Data” competition, aims to do the same for spoken language. We present a new generation of speech corpora, characterised by aggregation of datasets, annotated using forced alignment and exposed for public use in standard formats across multiple sites.
Similar resources
Feature extraction in opinion mining through Persian reviews
Opinion mining analyzes user reviews to extract opinions, sentiments and demands in a specific area, which can play an important role in major decisions in that area. In general, opinion mining processes user reviews at three levels: document, sentence and feature. Opinion mining at the feature level is taken into consideration more than the other two levels d...
Trends in Speech and Language Rehabilitation in Iran
This paper is a short review of the form and content of speech and language rehabilitation services and the trend of their institutionalization in Iran. It summarizes formal education in speech and language therapy in Iran, which originated with the establishment of a 4-year BS rehabilitation program in the College of Rehabilitation Sciences in Tehran in 1974. Since then, speech and language Rehabili...
Speech development and auditory performance in children after cochlear implantation
Background: The aim of this study was to determine the auditory performance of congenitally deaf children and the effect of cochlear implantation (CI) on speech intelligibility. Methods: A prospective study was undertaken on 47 children in a pediatric tertiary referral center for CI. All children were prelingually deaf and younger than 8 years of age. They were followed up until 5...
Speech enhancement based on hidden Markov model using sparse code shrinkage
This paper presents a new hidden Markov model-based (HMM-based) speech enhancement framework based on independent component analysis (ICA). We propose analytical procedures for training clean speech and noise models by the Baum re-estimation algorithm and present a maximum a posteriori (MAP) estimator based on a Laplace-Gaussian combination (for clean speech and noise, respectively) in the HMM ...
Improving the performance of MFCC for Persian robust speech recognition
Mel-frequency cepstral coefficients (MFCCs) are the most widely used features in speech recognition, but they are very sensitive to noise. In this paper, to achieve satisfactory performance in Automatic Speech Recognition (ASR) applications, we introduce a new noise-robust set of MFCC vectors estimated through the following steps. First, spectral mean normalization is a pre-processing which applies to t...